Photagraphy Wiki

Status: Planning

Brought to you by: mithryn, sgorman2

Structure of a JPEG File

JPEG files are predictable in their structure, with each segment of
information (whether metadata in a header or the image itself)
delimited by well-known hex values called “markers.” The general
structure is as follows:
JPEG Image

Byte Value	Description
0xFF 0xD8	Start of Image (SOI), the first two bytes of a JPEG file.
0xFF [Segment ID]	A marker indicating a new segment. Each type of segment has a unique ID.
0xFF 0xD9	End of Image (EOI), the last two bytes of the file.

Here are some additional JPEG segment markers:

Byte Value	Name	Description
0xFF 0xE0	APP0	Application marker (the JFIF header; present in most, but not all, JPEG files)
0xFF 0xDB	DQT	Define Quantization Table
0xFF 0xC0	SOF0	Start of Frame (baseline DCT)
0xFF 0xC4	DHT	Define Huffman Table
0xFF 0xDA	SOS	Start of Scan
0xFF 0xED	APP13	Photoshop storage * The one we need *
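
As a minimal sketch of how a reader might walk these segments, the Python below checks the Start of Image bytes and then steps from marker to marker. The function name `iter_segments` is made up for illustration. The code assumes that every segment before Start of Scan carries a two-byte big-endian length that counts itself but not the marker; it does not handle fill bytes or stand-alone markers.

```python
import struct


def iter_segments(data):
    """Yield (segment_id, payload) for each segment up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing Start of Image marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        segment_id = data[pos + 1]
        # Assumption: the 2-byte length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield segment_id, data[pos + 4:pos + 2 + length]
        if segment_id == 0xDA:
            # Start of Scan: compressed image data follows, so stop here.
            break
        pos += 2 + length


# A tiny hand-built byte string, not a real image.
sample = (b"\xff\xd8"
          + b"\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00"
          + b"\xff\xda" + struct.pack(">H", 2)
          + b"\xff\xd9")
for segment_id, payload in iter_segments(sample):
    print(hex(segment_id), len(payload))
```

Stopping at Start of Scan keeps the sketch short, because the compressed image data that follows is not divided into length-prefixed segments.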

The information we’re interested in is stored in a segment known as APP13, 0xFF 0xED. (APPn markers are numbered from APP0 at 0xFF 0xE0, which makes 0xFF 0xED APP13.) The APP13 segment has the following structure:
APP13 Segment

Byte Value	Description
0xFF 0xED	Start of the APP13 segment
2 bytes	The segment size as a big-endian integer, excluding the marker but including these two bytes.
Photoshop 3.0\x00	A fixed 14-byte identifier string, ending in a zero byte
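
The size rule in the table can be sketched in a few lines of Python. The helper names `build_app13` and `split_app13` are hypothetical, and the sketch assumes the whole segment (marker included) has already been sliced out of the file.

```python
import struct
from typing import Optional

PHOTOSHOP_ID = b"Photoshop 3.0\x00"  # 14 bytes, including the trailing zero byte


def build_app13(contents: bytes) -> bytes:
    """Wrap contents in a 0xFF 0xED segment with a freshly computed size."""
    body = PHOTOSHOP_ID + contents
    # The size counts its own two bytes but not the two marker bytes.
    return b"\xff\xed" + struct.pack(">H", len(body) + 2) + body


def split_app13(segment: bytes) -> Optional[bytes]:
    """Return the bytes after the fixed string, or None if this is not a match."""
    if segment[:2] != b"\xff\xed":
        return None
    (size,) = struct.unpack(">H", segment[2:4])
    body = segment[4:2 + size]
    if not body.startswith(PHOTOSHOP_ID):
        return None
    return body[len(PHOTOSHOP_ID):]


segment = build_app13(b"8BIM")
print(segment[2:4])          # the two size bytes
print(split_app13(segment))  # the contents after the fixed string
```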

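8BIM segments hold the individual fields in the APP13 segment (0xFF 0xED). An 8BIM segment in turn has the following structure:

Byte Value	Description
8BIM	A four-byte segment marker (this is, in fact, the ASCII string “8BIM”)
Segment type	Two bytes indicating the segment type (0x04 0x04 for IPTC data)
Segment name	A length-prefixed string padded to an even number of bytes; usually empty, which appears as two zero bytes
Segment size	Four bytes, big-endian, excluding the marker, type, name, and segment size. For data under 64 KB the first two bytes are zero, so together with an empty name they read as four bytes of 0 followed by a two-byte size
Segment data	The actual data of the 8BIM segment, followed by one zero byte if its length is odd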
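
The following Python is a rough sketch of reading 8BIM segments, not a complete parser. It assumes the layout given in Adobe's file format documentation: the name is a length-prefixed string padded to an even size and the size field is four bytes. With an empty name and data under 64 KB, those bytes read as four zero bytes followed by a two-byte size. The name `iter_8bim` is made up for this example.

```python
import struct


def iter_8bim(contents):
    """Yield (segment_type, data) for each 8BIM segment in contents."""
    pos = 0
    # 12 bytes is the smallest possible segment: marker, type, empty name, size.
    while pos + 12 <= len(contents) and contents[pos:pos + 4] == b"8BIM":
        (segment_type,) = struct.unpack(">H", contents[pos + 4:pos + 6])
        # Name: one length byte plus the characters, padded to an even total.
        name_total = 1 + contents[pos + 6]
        name_total += name_total % 2
        size_at = pos + 6 + name_total
        (size,) = struct.unpack(">I", contents[size_at:size_at + 4])
        data_at = size_at + 4
        yield segment_type, contents[data_at:data_at + size]
        # Data is padded to an even length as well.
        pos = data_at + size + (size % 2)


# Hand-built sample: one IPTC segment with 3 bytes of data, one empty segment.
sample = (b"8BIM\x04\x04\x00\x00" + struct.pack(">I", 3) + b"abc\x00"
          + b"8BIM\x03\xed\x00\x00" + struct.pack(">I", 0))
for segment_type, data in iter_8bim(sample):
    print(hex(segment_type), data)
```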

Inside the 8BIM segment’s data are additional subsegments, laid out
as follows:

Byte Value	Description
0x1C 0x02	Subsegment marker (0x1C is the tag marker; 0x02 is the IPTC record number)
Segment type	1 byte indicating the type of subsegment
Segment size	2 bytes, big-endian, excluding the marker, type, and size
Segment data	The data

The IPTC keyword itself is then stored in one of these subsegments, specifically type 0x19 (decimal 25). An image may carry several keyword subsegments, because the standard allows more than one keyword per image and each keyword gets its own subsegment.
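
To illustrate the subsegment layout and the keyword type together, here is a small Python sketch that pulls the keywords out of an 8BIM segment's data. The function names are hypothetical. The code assumes keywords are UTF-8 text (real files may use another encoding), and it ignores the extended-length form that the IPTC standard allows for very large values.

```python
import struct


def iter_subsegments(data):
    """Yield (record, segment_type, value) for each 0x1C-tagged subsegment."""
    pos = 0
    while pos + 5 <= len(data) and data[pos] == 0x1C:
        record = data[pos + 1]
        segment_type = data[pos + 2]
        # Size excludes the marker, type, and size bytes themselves.
        (size,) = struct.unpack(">H", data[pos + 3:pos + 5])
        yield record, segment_type, data[pos + 5:pos + 5 + size]
        pos += 5 + size


def read_keywords(data, encoding="utf-8"):
    """Collect every keyword subsegment (0x1C 0x02, type 0x19) as text."""
    return [value.decode(encoding, errors="replace")
            for record, segment_type, value in iter_subsegments(data)
            if record == 0x02 and segment_type == 0x19]


# Hand-built sample: two keywords with an unrelated subsegment between them.
sample = (b"\x1c\x02\x19\x00\x03cat"
          + b"\x1c\x02\x78\x00\x02hi"
          + b"\x1c\x02\x19\x00\x03dog")
print(read_keywords(sample))  # ['cat', 'dog']
```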

A program that manipulates these keywords, then, must do the following:

- Parse the header segments to find out whether an APP13 segment (0xFF 0xED) exists and, if so, whether it contains the Photoshop string, an 8BIM segment, and 0x19 subsegments.
- If it does contain 0x19 subsegments, read them and present them to the user, so that the user knows which keywords have already been assigned.
- If the user deletes, changes, or adds keywords, rewrite the entire image file to a new file on save, with recalculated lengths for every changed subsegment and for each parent segment that contains it.
  - Note that each subsegment has a length, as does each parent
    segment, so a change to one keyword also changes the 8BIM size and the APP13 size.
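
The length recalculation in the last step can be sketched by building the nested structure from the inside out, so that each parent size is computed from the finished child. This is a simplified illustration under stated assumptions: `build_keyword_segment` is a made-up name, keywords are encoded as UTF-8, the 8BIM name is empty, and any other 8BIM segments or subsegments in the original file (which a real program would need to carry over unchanged) are left out.

```python
import struct


def build_keyword_segment(keywords):
    """Build a complete 0xFF 0xED segment holding only the given keywords."""
    # Innermost level: one 0x1C 0x02 0x19 subsegment per keyword.
    subsegments = b""
    for keyword in keywords:
        value = keyword.encode("utf-8")  # assumption: UTF-8 text
        subsegments += b"\x1c\x02\x19" + struct.pack(">H", len(value)) + value

    # Middle level: an 8BIM segment of type 0x0404 with an empty name.
    # Its size covers the data only; odd-length data gets one zero pad byte.
    padding = b"\x00" if len(subsegments) % 2 else b""
    bim = (b"8BIM\x04\x04\x00\x00"
           + struct.pack(">I", len(subsegments)) + subsegments + padding)

    # Outer level: the size counts its own two bytes but not the marker.
    body = b"Photoshop 3.0\x00" + bim
    if len(body) + 2 > 0xFFFF:
        raise ValueError("too much data for a single segment")
    return b"\xff\xed" + struct.pack(">H", len(body) + 2) + body


segment = build_keyword_segment(["cat", "dog"])
print(len(segment), struct.unpack(">H", segment[2:4])[0])
```

Building bottom-up means no length is ever patched after the fact, which is one way to avoid stale sizes when a keyword changes.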
